Dataviz Makeover 2

Interactive visualisation of inter- and intra-zonal public bus flows at the sub-zone level, January 2022.

Caine Ng https://www.linkedin.com/in/caine-ng-069a5273/ (SMU - MITB) https://scis.smu.edu.sg/master-it-business
2022-03-16

1. The original visualization

The original interactive visualization, shown above and available here, was created using data prepared and downloaded from LTA DataMall together with the URA region, planning area and planning subzone data. It is meant to visualize intra- and inter-zonal public bus flows in Singapore for January 2022.

2. Critiques on Clarity, Aesthetics and Interactivity

The original graphic uses bar charts, an adjacency matrix and filters to show public bus flows at different times of the day. The issues identified in terms of clarity, aesthetics and interactivity are listed below, together with suggestions:

In terms of clarity:

Item | Critique | Suggestion
1 | Chart title: No chart title is provided, so readers are left to interpret what the graphic is about. | An appropriate chart title such as ‘Visualising Public Bus Flows for January 2022’ could be used to tell readers what the chart is about.
2 | Use of adjacency matrix: The concept of an adjacency matrix may not be familiar to many readers, who may not understand what the graphic is trying to show. | An alternative graphic supplemented with interactivity could be used instead, so that readers intuitively recognise the links between origin and destination. For instance, configuring the graphic so that clicking a source highlights its destinations would be more intuitive.
3 | Compressed adjacency matrix: Because of the size of the graphic, the adjacency matrix used to indicate links between origin and destination subzones is compressed and unreadable unless the mouse hovers over the plot to show the tooltip. The large amount of information and numerous individual levels lead to a very sparse adjacency matrix, and the limited space given to it creates a compressed graphic. | An alternative visualization using interactivity could represent the links between origin and destination, with filters to reduce the amount of data shown at one time.
4 | User guide: No short user guide is provided to help readers understand how the current interactivity features work. | A brief description and guide could be provided so that readers can fully appreciate what the visualization is about.

In terms of aesthetics:

Item | Critique | Suggestion
1 | Not using color: While it is preferable to reduce non-data ink in a visualization, using a single color for all the bar graphs requires readers to read each title to tell the charts apart. | Appropriate colors could be used to indicate the different data shown in the graphic. For instance, using blue for origin graphics and a different color such as red for destination data would allow readers to intuitively separate the information.
2 | Use of multiple graphs: Separate graphs are used to show each facet of the source, destination, weekday and weekend bus trip frequency, requiring the reader to compare multiple graphs at a time. | With interactivity and filters, some of the graphics could be combined and shown only when required as the reader explores the data. With the focus on the links between source and destination, hourly frequency could be shown in a tooltip as the reader hovers over the data. The reader can then filter accordingly based on the information shown in the tooltip.
3 | Graph size: The use of multiple graphs does not readily allow all of them to be shown on one page, so the reader must scroll between graphs to make a comparison. | By combining the graphics and using tooltips to show parts of the information, the graphs could be shown on one page, allowing easier comparison of the information.

In terms of interactivity:

Item | Critique | Suggestion
1 | Separate filters: Separate filters are used for the source sub-zone and the destination sub-zone. Presenting them separately allows the user to compare the relative frequency of bus trips at two separate sub-zones. However, the graphic is meant to show flows of buses between sub-zones, so the destination graph should show only the trips coming from the selected source. | Filters between the source and destination graphics could be linked, so that filtering to one source shows the number of trips to different locations. The reverse could also be done to show the number of bus trips from different sources to a destination.
2 | Use of tooltips: Because the axis labels of the adjacency plot are compressed, tooltips are used to help show its information; however, they could be put to better use in reducing the number of graphics shown. | Tooltips could instead show other information that is not presented on the graphic, or add detail. An example would be to show the bar graphs for source and destination in the tooltip, reducing the number of graphs shown and possibly allowing the adjacency matrix to be shown in full.
3 | Not making use of filters: Currently, filters are used only for selecting the source and destination sub-zones presented. Filters could instead reduce the amount of information shown, revealing it only when the user wishes to see it. | Filters could be used to reduce the amount of information shown. For instance, they could show one source sub-zone with its corresponding destination sub-zones, possibly making the adjacency plot more readable.

3. Proposed Design

3.1 Considerations

The original intention of the visualization is to show intra- and inter-zonal public bus flows. With that in mind, the graphic should show the connections between source and destination.

The use of interactivity would allow several of the graphics to be combined and shown only when required. With the focus on bus flows, filters could be used to narrow the data to specific hours of the day and to show specific trip sources and the destinations their buses are going to. By linking filters between the source and destination graphics, filtering to one destination should show where trips are coming from.

With tooltips, the frequency of trips at different hours of the day could be shown when the user hovers over a specific point, allowing the reader to follow up by filtering the graphic to an hour of the day or a source sub-zone of interest.

3.2 Proposed Sketch

Because the data represents sub-zone locations around the country, it can be visualized as a map, allowing readers to see the positions of different localities relative to each other. The proposed makeover consists of a pair of Singapore maps, one showing the locations of sources and the other the destinations. Two maps are used to allow comparison between origins and destinations. Separate maps are used instead of one combined map because the same sub-zone could be both a source and a destination, which would lead to overlapping and cluttered data points.

Different colors would be used for origin and destination graphics.

The frequency of bus trips at a single hour of the day is represented by the size of the points on the map. Larger dots indicate more bus trips to or from a location, and their relative size allows readers to see the difference in frequency.
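The size encoding can be illustrated with a minimal Python sketch. It assumes a fixed value range (500 to 250000, the values later entered in the size legend) and scales the circle area, rather than the radius, with the trip count. The function name and the radius limits are hypothetical, and this is not a description of how Tableau computes mark sizes internally.

```python
import math

def mark_radius(trips, start=500, end=250_000, min_radius=2.0, max_radius=30.0):
    """Map a trip count to a circle radius so that area grows linearly with trips.

    start/end mirror the legend range used later in the dashboard;
    min_radius/max_radius are arbitrary illustrative values.
    """
    # Values outside the range are clamped to its ends.
    clamped = max(start, min(end, trips))
    share = (clamped - start) / (end - start)  # 0.0 at start, 1.0 at end
    min_area = math.pi * min_radius ** 2
    max_area = math.pi * max_radius ** 2
    area = min_area + share * (max_area - min_area)
    return math.sqrt(area / math.pi)

for count in (500, 50_000, 250_000):
    print(count, round(mark_radius(count), 2))
```

Scaling by area is a common convention for proportional symbols, since readers tend to compare the areas of circles rather than their radii.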

The points on the map would also be configured as filters and linked between origin and destination, so that when users click on a specific origin or destination, the related locations on the other map are filtered accordingly.

Filters are used to narrow the data to a specific hour and to either weekday or weekend bus trip data. This allows readers to focus on specific time periods of interest and, with animated transitions from one state to another, to see the difference between states.

The overall frequency throughout the day on both weekends and weekdays would be shown in a tooltip as the user hovers over a specific location. This gives readers an overview of the location, and they can then adjust the filters based on the information seen in the tooltip.

Relevant titles and a brief user guide would be added to help readers understand what the graphic is about and what options are available.

3.3 Advantages of the proposed design

The advantages of the proposed design:

In terms of clarity

In terms of aesthetics

In terms of interactivity

4. Data Visualization Process

The data for this makeover was provided, so no editing or wrangling was done prior to use, and the following sections document the steps used to generate the visualization. Collated bus trip counts between origin and destination sub-zones were provided in the origin_destination_bus__SZ_202201.csv file, and the collection of shapefiles from MP14_SUBZONE_WEB_PL was used to generate the map.
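Before loading the file into Tableau, it can help to check its shape. The following is a minimal Python sketch, assuming the CSV has the columns ORIGIN_SZ, DESTINATION_SZ, DAY_TYPE, TIME_PER_HOUR and TOTAL_TRIPS. These names are inferred from the Tableau field names used later in this document, and the sample rows and the `load_trips` helper are made up for illustration.

```python
import csv
import io

# Made-up sample rows standing in for origin_destination_bus__SZ_202201.csv.
# Column names and DAY_TYPE values are assumptions, not confirmed by the source.
SAMPLE_CSV = """ORIGIN_SZ,DESTINATION_SZ,DAY_TYPE,TIME_PER_HOUR,TOTAL_TRIPS
TAMPINES EAST,BEDOK NORTH,WEEKDAY,8,120
TAMPINES EAST,TAMPINES EAST,WEEKDAY,8,300
BEDOK NORTH,TAMPINES EAST,WEEKENDS/HOLIDAY,9,45
"""

def load_trips(handle):
    """Read trip rows and convert the numeric columns to integers."""
    rows = []
    for row in csv.DictReader(handle):
        row["TIME_PER_HOUR"] = int(row["TIME_PER_HOUR"])
        row["TOTAL_TRIPS"] = int(row["TOTAL_TRIPS"])
        rows.append(row)
    return rows

trips = load_trips(io.StringIO(SAMPLE_CSV))
# An intra-zonal flow starts and ends in the same sub-zone.
intra = [r for r in trips if r["ORIGIN_SZ"] == r["DESTINATION_SZ"]]
print(len(trips), "rows,", len(intra), "intra-zonal")
```

To run it against the real file, pass an open file handle (opened with `newline=""`) to `load_trips` instead of the in-memory sample.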

The final visualization consists of a pair of maps, with tooltips to show bus trip frequency. Data is first loaded into Tableau. The thematic maps and bar graphs for source and destination are then created separately before all graphics are combined into a dashboard.

4.1 Loading Data into Tableau

The first part is to load the data required for generating the visualization into Tableau.

Item Steps Images
1 Drag and drop origin_destination_bus__SZ_202201.csv into Tableau startup interface to load bus trip data.
2 Click on ‘Add’ and select spatial file. In the Windows Explorer pop-up, select the ‘MP14_SUBZONE_WEB_PL.shp’ file from the directory where it is stored to add the shapefile data into Tableau.
3 By default origin_destination_bus__SZ_202201.csv should appear in the data pane. Click on MP14_SUBZONE_WEB_PL in the connections pane, then drag and drop the .shp under the files pane to the data pane.
4 Configure the relation between origin_destination_bus__SZ_202201.csv and MP14_SUBZONE_WEB_PL by selecting the ‘Origin Sz’ column from origin_destination_bus__SZ_202201 and the ‘Subzone N’ column from MP14_SUBZONE_WEB_PL.
5 To avoid confusion later, right click the origin_destination_bus__SZ_202201 block in the data pane and select rename to rename it as ‘origin_sheet’.
6 Next select origin_destination_bus__SZ_202201.csv in the connections pane then drag and drop origin_destination_bus__SZ_202201.csv into the data pane to have another copy of the data inside the data pane.
7 Configure the relation between MP14_SUBZONE_WEB_PL.shp and the newly added origin_destination_bus__SZ_202201.csv sheet by selecting the ‘Subzone N’ column for MP14_SUBZONE_WEB_PL.shp and the ‘Destination_SZ’ column for the origin_destination_bus__SZ_202201.csv sheet.
8 Right click the newly added origin_destination_bus__SZ_202201 block in the data pane and select rename to rename it as ‘destination_sheet’.
9 At this point the data required for creating the visualizations is loaded into Tableau as shown in the image.
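The two relationships configured above can be pictured as matching each trip row to a sub-zone record by name, once on the origin column and once on the destination column. The Python sketch below is only a rough analogy under stated assumptions: Tableau resolves relationships at query time rather than as a plain join, the column names are inferred from the field names above, and the `relate` helper, sample rows and coordinates are hypothetical.

```python
# Hypothetical sub-zone records standing in for MP14_SUBZONE_WEB_PL,
# keyed by the sub-zone name; coordinates are placeholders, not real centroids.
subzones = {
    "TAMPINES EAST": (1.0, 2.0),
    "BEDOK NORTH": (3.0, 4.0),
}

# Made-up trip rows; the third origin has no matching sub-zone record.
trips = [
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "BEDOK NORTH", "TOTAL_TRIPS": 120},
    {"ORIGIN_SZ": "BEDOK NORTH", "DESTINATION_SZ": "TAMPINES EAST", "TOTAL_TRIPS": 45},
    {"ORIGIN_SZ": "UNKNOWN ZONE", "DESTINATION_SZ": "BEDOK NORTH", "TOTAL_TRIPS": 7},
]

def relate(rows, zones, key):
    """Attach a location to each trip row by matching `key` to the sub-zone name."""
    matched, unmatched = [], []
    for row in rows:
        location = zones.get(row[key])
        if location is None:
            unmatched.append(row)
        else:
            matched.append({**row, "location": location})
    return matched, unmatched

# One copy of the trip data per relationship, as in the data pane.
origin_sheet, origin_missing = relate(trips, subzones, "ORIGIN_SZ")
destination_sheet, destination_missing = relate(trips, subzones, "DESTINATION_SZ")
print(len(origin_sheet), len(origin_missing))
print(len(destination_sheet), len(destination_missing))
```

Listing the unmatched rows is a quick way to spot sub-zone names that are spelled differently in the two sources and would therefore not appear on the map.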

4.2 Source Thematic Map

The next part would be to create a thematic map representing number of trips from different source locations:

Item Steps Images
1 Click on Sheet1 to start creating the visualization.
2 Drag and drop the ‘Geometry’ pill under the MP14_SUBZONE_WEB_PL table to the main pane to add a map of Singapore to the graph.
3 Right click Time Per Hour under origin_sheet and select ‘convert to dimension’.
4 Drag and drop the Time Per Hour to the filters pane, select ‘All’ then click okay.
5 Right click on the Time Per Hour pill in filters pane and select show filter to add a filter to the graphic.
6 Click on the small triangle in the Time per hour filters pane and select ‘Single Value Slider’ to change its appearance to a slider.
7 Drag and drop the Day Type under origin sheet table to the filters pane, select ‘All’ then click okay.
8 Right click on the Day Type pill in filters pane and select show filter to add a filter to the graphic.
9 Click on the small triangle in the Day Type filter pane and select ‘Single Value list’ to change its appearance to a list.
10 Drag and drop the ‘Origin Sz’ pill and Total Trips pill to details shelf and size shelf respectively.
11 In the marks pane change the option from ‘Automatic’ to ‘Circle’.
12 Right click the title and select edit title. Change the title to ‘Bus Trip Origin’. Change to bold font and color to blue.
13 Click on the pin (thumbtack) icon that shows up when hovering the mouse over the top left corner of the map to fix the position of the map.
14 Right click the sheet name at the bottom of the sheet to rename the sheet as ‘Origin’.
15 At this point the result is a thematic map showing the relative number of trips from each source subzone, which can be filtered by hour of the day and by weekday or weekend.
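What this map encodes can be sketched in Python as a filter followed by a sum: keep the rows for the chosen hour and day type, then total the trips per sub-zone. The sketch assumes the column names ORIGIN_SZ, DESTINATION_SZ, DAY_TYPE, TIME_PER_HOUR and TOTAL_TRIPS and the day type values shown. The sample rows and the `trips_by_subzone` helper are made up for illustration.

```python
from collections import defaultdict

# Made-up sample rows; column names and DAY_TYPE values are assumptions.
trips = [
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "BEDOK NORTH",
     "DAY_TYPE": "WEEKDAY", "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 120},
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "TAMPINES WEST",
     "DAY_TYPE": "WEEKDAY", "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 80},
    {"ORIGIN_SZ": "BEDOK NORTH", "DESTINATION_SZ": "TAMPINES EAST",
     "DAY_TYPE": "WEEKDAY", "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 45},
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "BEDOK NORTH",
     "DAY_TYPE": "WEEKDAY", "TIME_PER_HOUR": 9, "TOTAL_TRIPS": 60},
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "BEDOK NORTH",
     "DAY_TYPE": "WEEKENDS/HOLIDAY", "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 30},
]

def trips_by_subzone(rows, key, hour, day_type):
    """Sum TOTAL_TRIPS per sub-zone (grouped by `key`) for one hour and day type."""
    totals = defaultdict(int)
    for row in rows:
        if row["TIME_PER_HOUR"] == hour and row["DAY_TYPE"] == day_type:
            totals[row[key]] += row["TOTAL_TRIPS"]
    return dict(totals)

# Origin map: one total per source sub-zone, which drives the circle size.
origin_totals = trips_by_subzone(trips, "ORIGIN_SZ", hour=8, day_type="WEEKDAY")
print(origin_totals)
```

Passing "DESTINATION_SZ" as the key gives the equivalent totals for a map of destinations.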

4.3 Destination Thematic Map

The next part is to create a thematic map representing the number of trips to different destination locations. Most of the steps are similar to the previous section; the difference is where the columns are selected from. For this graph, data is selected from destination_sheet instead of origin_sheet:

Item Steps Images
1 Start by adding a new sheet by clicking ‘new worksheet’.
2 Drag and drop the ‘Geometry’ pill under the MP14_SUBZONE_WEB_PL table to the main pane to add a map of Singapore to the graph.
3 Right click Time Per Hour under destination_sheet and select ‘convert to dimension’.
4 Drag and drop the Time Per Hour under destination_sheet to the filters pane, select ‘All’ then click okay.
5 Right click on the Time Per Hour pill in filters pane and select show filter to add a filter to the graphic.
6 Click on the small triangle in the Time per hour filters pane and select ‘Single Value Slider’ to change its appearance to a slider.
7 Drag and drop the Day Type under destination_sheet table to the filters pane, select ‘All’ then click okay.
8 Right click on the Day Type pill in filters pane and select show filter to add a filter to the graphic.
9 Click on the small triangle in the Day Type filter pane and select ‘Single Value list’ to change its appearance to a list.
10 Drag and drop the ‘Destination Sz’ pill and Total Trips pill from under destination_sheet to details shelf and size shelf respectively.
11 In the marks pane change the option from ‘Automatic’ to ‘Circle’.
12 Click on colors in the marks pane and select red color to change color of the marks to red.
13 Right click the title and select edit title. Change the title to ‘Bus Trip Destination’. Change to bold font and color to blue.
14 Click on the pin (thumbtack) icon that shows up when hovering the mouse over the top left corner of the map to fix the position of the map.
15 Right click the sheet name at the bottom of the sheet to rename the sheet as ‘Destination’.
16 At this point the result is a thematic map showing the relative number of trips to each sub-zone, which can be filtered by hour of the day and by weekday or weekend.

4.4 Source Frequency Bar Graph

Next, bar graphs showing the frequency of bus trips from a sub-zone are generated using the origin_sheet data:

Item Steps Images
1 Start by adding a new sheet by clicking ‘new worksheet’ icon at the bottom of the window.
2 Drag and drop the ‘Origin Sz’ pill from under the origin_sheet table to the filters pane. In the pop up menu select All then click okay.
3 Right click on the Origin Sz pill in filters pane and select show filter to add a filter to the graphic.
4 Click on the small triangle in the Origin Sz filters pane and select ‘Single Value (dropdown)’ to change its appearance to a drop down list. Tentatively select one sub zone to filter data to one subzone.
5 Drag and drop from under the origin_sheet table ‘Time per hour’ to columns shelf, Day Type to rows shelf and Total Trips to rows shelf.
6 Right click the total trips axis labels and uncheck the show header option to hide the header labels. Then adjust the y-axis by hovering the mouse over the axis until the mouse icon changes to the width adjustment icon, then click and drag the axis to the right to show the full row label.
7 Right click the row field labels at the top left corner of the worksheet and select hide row field labels to hide them.
8 Right click the column field labels at the top centre of the worksheet and select hide column field labels to hide them.
9 Right click the chart title and select edit title to change the title to ‘Frequency of Trips Per Hour’. Configure the title to bold and blue in color.
10 Change the size of the graph to entire view from the view option in the menu bar.
11 Right click the worksheet name and select rename to change the worksheet name to OriginFreq.
12 At this step a bar graph for the frequency of trips per hour from an origin location is generated.
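The data behind this bar graph can be sketched in Python as an hourly profile for one selected sub-zone, split by day type. The column names ORIGIN_SZ, DAY_TYPE, TIME_PER_HOUR and TOTAL_TRIPS are assumptions based on the field names above, and the sample rows and the `hourly_profile` helper are hypothetical.

```python
# Made-up sample rows; column names and DAY_TYPE values are assumptions.
trips = [
    {"ORIGIN_SZ": "TAMPINES EAST", "DAY_TYPE": "WEEKDAY",
     "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 120},
    {"ORIGIN_SZ": "TAMPINES EAST", "DAY_TYPE": "WEEKDAY",
     "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 80},
    {"ORIGIN_SZ": "TAMPINES EAST", "DAY_TYPE": "WEEKDAY",
     "TIME_PER_HOUR": 9, "TOTAL_TRIPS": 60},
    {"ORIGIN_SZ": "TAMPINES EAST", "DAY_TYPE": "WEEKENDS/HOLIDAY",
     "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 30},
    {"ORIGIN_SZ": "BEDOK NORTH", "DAY_TYPE": "WEEKDAY",
     "TIME_PER_HOUR": 8, "TOTAL_TRIPS": 45},
]

def hourly_profile(rows, origin):
    """Return {day_type: {hour: total_trips}} for trips leaving `origin`."""
    profile = {}
    for row in rows:
        if row["ORIGIN_SZ"] != origin:
            continue
        by_hour = profile.setdefault(row["DAY_TYPE"], {})
        hour = row["TIME_PER_HOUR"]
        by_hour[hour] = by_hour.get(hour, 0) + row["TOTAL_TRIPS"]
    return profile

profile = hourly_profile(trips, "TAMPINES EAST")
for day_type, by_hour in profile.items():
    print(day_type, sorted(by_hour.items()))
```

Each day type corresponds to one row of bars in the worksheet, and each hour to one bar.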

4.5 Destination Frequency Bar Graph

Next, bar graphs showing the frequency of bus trips to a sub-zone are generated using the destination_sheet data. Most steps are similar to those above, with minor differences in which columns are selected for the chart:

Item Steps Images
1 Start by adding a new sheet by clicking ‘new worksheet’ icon at the bottom of the window.
2 Drag and drop the ‘Destination Sz’ pill from under the destination_sheet table to the filters pane. In the pop up menu select All then click okay.
3 Right click on the Destination Sz pill in filters pane and select show filter to add a filter to the graphic.
4 Click on the small triangle in the Destination Sz filters pane and select ‘Single Value (dropdown)’ to change its appearance to a drop down list. Tentatively select one sub zone to filter data to one subzone.
5 Drag and drop from under the destination_sheet table ‘Time per hour’ to columns shelf, Day Type to rows shelf and Total Trips to rows shelf.
6 Right click the total trips axis labels and uncheck the show header option to hide the header labels. Then adjust the y-axis by hovering the mouse over the axis until the mouse icon changes to the width adjustment icon, then click and drag the axis to the right to show the full row label.
7 Right click the row field labels at the top left corner of the worksheet and select hide row field labels to hide them.
8 Right click the column field labels at the top centre of the worksheet and select hide column field labels to hide them.
9 Right click the chart title and select edit title to change the title to ‘Frequency of Trips Per Hour’. Configure the title to bold and blue in color.
10 Click on colors in the marks shelf and select red to change the graph to red color.
11 Change the size of the graph to entire view from the view option in the menu bar.
12 Right click the worksheet name and select rename to change the worksheet name to DestinationFreq.
13 At this step a bar graph for the frequency of trips per hour to a destination location is generated.

4.6 Assembling the Dashboard and Configuring Interactivity

In this part the different visualisations are combined and interactivity is configured:

Item Steps Images
1 Start by adding a new dashboard by clicking the ‘New Dashboard’ icon.
2 Drag and drop the ‘Origin’ and ‘Destination’ worksheets into the dashboard pane, positioning the ‘Destination’ worksheet underneath the ‘Origin’ worksheet.
3 Configure marks on the origin graph to be used as a filter by first clicking on the origin graph and selecting ‘Use as filter’ in the top right border.
4 To configure the action after filtering, click ‘Actions’ under the dashboard menu.
5 In the follow-on pop-up actions menu, the first row would be the ‘filter’ action enabled on the graph. Select it and click edit.
6 Configure the action by checking ‘Origin’ in source sheet and ‘Destination’ in target sheet, then setting the source field to ‘Origin SZ’ of the origin worksheet and the target field to ‘Origin SZ’ of the destination sheet.
7 Configure marks on the destination graph to be used as a filter by first clicking on the destination graph and selecting ‘Use as filter’ in the top right border.
8 Return to the actions menu by clicking ‘Actions’ under the dashboard menu.
9 In the follow-on pop-up actions menu, a second filter action would appear. Select it and click edit.
10 Configure the action by checking ‘Destination’ in source sheet and ‘Origin’ in target sheet, then setting the source field to ‘Destination_sz’ of the destination sheet and the target field to ‘Destination Sz’ of the origin sheet.
11 Click on Origin tab to return to the origin worksheet.
12 Click on tooltip in the marks pane to bring up the tooltip options menu. Add a title ‘Frequency of Bus Trips over the day’ in bold, blue font, size 12. Click insert and select OriginFreq to add the origin bar graph to the tooltip. Edit the max width of the plot to 450 to show the full plot. Change the header ‘Origin SZ’ to ‘Origin Sub-zone’ and convert it to blue, bold font. Change the ‘Total Trips’ header to bold blue font as well. Then click okay.
13 Click on Destination tab to return to the Destination worksheet.
14 Click on tooltip in the marks pane to bring up the tooltip options menu. Add a title ‘Frequency of Arrivals over the day’ in bold, blue font, size 12. Click insert and select DestinationFreq to add the destination bar graph to the tooltip. Edit the max width of the plot to 450 to show the full plot. Change the header from ‘DESTINATION_SZ (origin_destination_bus__SZ_202201.csv1)’ to ‘Destination Sub-zone’ and convert it to blue, bold font. Change the SUM(TOTAL_TRIPS (origin_destination_bus__SZ_202201.csv1)) header to ‘Total Arrivals’ in bold blue font as well. Then click okay.
15 Click on Dashboard tab to return to the Dashboard.
16 Click on Day Type filter at the top right corner and click in the small triangle to bring up extra options. In the ‘Apply to Worksheets’ options, click ‘Selected Worksheets’. In the follow on menu check ‘Destination’.
17 Click on Time Per Hour filter at the top right corner and click in the small triangle to bring up extra options. In the ‘Apply to Worksheets’ options, click ‘Selected Worksheets’. In the follow on menu check ‘Destination’.
18 Remove duplicate filters ‘Day_Type’ and ‘Time_Per_Hour’ on the lower half of the dashboard by clicking on them and selecting ‘Remove From Dashboard’.
19 Change the legend at the top of the dashboard from automatic adjustment to displaying a range by clicking on the legend to show extra options, then clicking on the small triangle and selecting ‘Edit Sizes’. In the follow-on menu, select ‘By range’ in the sizes vary option. Check start value for range and end value for range. Change the start value to 500 and the end value to 250000.
20 Remove the duplicate legend on the lower half of the dashboard by clicking on it, followed by clicking ‘Remove from dashboard’.
21 Check ‘show dashboard title’ to add a title to the dashboard.
22 Right click on the title and select edit title. Change the title to ‘Inter- and Intra-Zonal Bus Flows’ in blue, bold, size 18 font. Add the caption ‘Explore bus flows between subzones in Singapore for January 2022’ in bold, black, size 12 font underneath the title.
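The effect of the two filter actions configured in steps 3 to 10 can be sketched in Python: selecting a mark on one map restricts the trip rows to that sub-zone, and the other map then shows totals for the matching rows only. This is an illustrative approximation rather than Tableau's implementation, and the column names, sample rows and the `linked_filter` helper are assumptions.

```python
from collections import defaultdict

# Made-up sample rows; column names are assumptions.
trips = [
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "BEDOK NORTH", "TOTAL_TRIPS": 120},
    {"ORIGIN_SZ": "TAMPINES EAST", "DESTINATION_SZ": "TAMPINES WEST", "TOTAL_TRIPS": 80},
    {"ORIGIN_SZ": "BEDOK NORTH", "DESTINATION_SZ": "TAMPINES EAST", "TOTAL_TRIPS": 45},
    {"ORIGIN_SZ": "TAMPINES WEST", "DESTINATION_SZ": "TAMPINES EAST", "TOTAL_TRIPS": 20},
]

def linked_filter(rows, selected, source_key, target_key):
    """Keep rows whose `source_key` equals `selected`, then total trips per `target_key`."""
    totals = defaultdict(int)
    for row in rows:
        if row[source_key] == selected:
            totals[row[target_key]] += row["TOTAL_TRIPS"]
    return dict(totals)

# Clicking an origin: where do trips from Tampines East go?
going_to = linked_filter(trips, "TAMPINES EAST", "ORIGIN_SZ", "DESTINATION_SZ")
# Clicking a destination: where do trips into Tampines East come from?
coming_from = linked_filter(trips, "TAMPINES EAST", "DESTINATION_SZ", "ORIGIN_SZ")
print(going_to)
print(coming_from)
```

The same function serves both directions because the two actions differ only in which column is the source of the selection and which is the target.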

5. Final Visualization

A screenshot of the final visualization is shown below:

The full visualization can be found on Tableau Public here

6. Main Observations

Listed below are some of the observations drawn from the made over graphic:

Item Observation Takeaway
1 Overall increase in participation rate. Comparing the labour participation rate between 2010 and 2021, the majority of age groups show an increase in participation rate over the years. This indicates that more individuals remain in work into later years, which could possibly be attributed to the rising cost of living or changes in family roles to support both elders and children. The participation rate of the working adult age groups between ages 25 and 54 has increased by 4.4 percent to 9.4 percent.
2 Biggest increase seen in age group 65 to 69. The biggest increase in labour force participation rate is seen in the 65-69 age group, with a difference of 20 percentage points between 2021 and 2010. This is likely driven by changes in employment policy over the years to allow senior workers to remain in the work force. The increase in recent years could also be related to increased demand for workers in roles such as security and pandemic-related services, leading to seniors returning to the work force.
3 Drop in workforce participation in age group 20 to 24. From 2010 to 2021, the only age group to see a drop in labour participation rate is the 20 to 24 group, with a drop of 3.4 percentage points. This is likely driven by changes in education demands and more youths seeking degrees before starting their first job.
4 Sharp drop in labour participation rate for youths 15 to 24 in 2020. Both age groups (15 to 19, 20 to 24) saw a drop in labour participation rate in 2020. This drop was more severe for the 20-24 age group, from 61% to 56%. This is likely driven by Covid-19 pandemic restrictions on F&B services, which are typically staffed by young part-time workers. It could also be driven by the difficulty fresh graduates faced in seeking employment during this period, as companies were cutting costs. However, the overall participation rate increased in 2021 as youths found alternative employment opportunities.